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Abstract 



o . 

A 4D-Var data assimilation technique is applied to a ORCA-2 configuration of the NEMO in order 
■ to identify the optimal parametrization of the boundary conditions on the lateral boundaries as well 

' as on the bottom and on the surface of the ocean. The influence of the boundary conditions on the 

solution is analyzed as in the assimilation window and beyond the window. It is shown that optimal 
conditions for vertical operators allows to get stronger and finer jet streams (Gulf Stream, Kuroshio) 
in the solution. Analyzing the reasons of the jets reinforcement, we see that the major impact of 
the data assimilation is made on the parametrization of the bottom boundary conditions for lateral 
velocities u and v. 

\ Automatic generation of the tangent and adjoint codes is also discussed. Tapenade software is 

fi |. shown to be able to produce the adjoint code that can be used after a memory usage optimization. 

Keywords: Variational Data Assimilation; Boundary conditions; ORCA-2 model. 
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O ■ 1 Introduction 

Variational data assimilation technique, first proposed in |17j . |19j . is based on the optimal control 
methods [22] and perturbations theory [26] . This technique allows us to retrieve an optimal data for 
a given model from heterogeneous observation fields. Since the early 1990's, many mathematical and 
geophysical teams are involved in the development of the data assimilation strategy One can cite many 
papers devoted to this problem, as in the domain of development of different methods for the data 
assimilation and also in the domain of its applications to the atmosphere and oceans. 

In the beginning, data assimilation methods were intended to identify and reconstruct an optimal 
initial state of the model. However, the idea that other model's parameters should also be identified by 
data assimilation has also been studied and discussed in numerous papers. One can cite several examples 
^vq ' of using data assimilation to identify the bottom topography of simple models ([23], jS], QT]), to control 

open boundary conditions in coastal and regional models ([22], [33]' [35; [3S])> boundary conditions on 
rigid boundaries ([TS], [21] [2D]> [12]) [13]) [33]) an d to determine other parameters of a model ([37], [3D], 

E). 

It was pointed out in |29) that the problem of adjoint parameters identification is frequently ill-posed. 
This fact remains valid when boundary conditions are considered as a control parameter. It was shown in 
|12) that the presence of non-null kernel of the Hessian results in a non-unique choice of optimal boundary 
conditions. However, all sets in the kernel from the kernel are equivalent: they provide the same (or almost 
the same) cost function's value and almost the same evolution of the model's solution after the end of 
assimilation. Consequently we can not talk about physical meaning of the optimal conditions. Indeed, 
any value from the kernel can be obtained as the result of the minimization procedure while only one 
of them may have a physical meaning. In this paper we take this ambiguity into account and consider 
the adjoint boundary condition estimation as a simple compensation of the model errors rather than real 
identification of model parameters. 

Particular attention is payed to the design of the tangent and adjoint codes. As it has been shown 
in [13] , the derivative of the model with respect to boundary conditions is two or three times longer (as 
well as in terms of the development, the number lines of the code and the necessary CPU time) than 
the derivative used to control the initial point of the model. The usual tangent model describes a linear 
evolution of a small perturbation to a model solution while the derivative with respect to other parameter 
includes also a block that introduce the perturbation into the model. This block may be at least as long 
and complex as the whole derivative with respect to initial conditions. 

That's why we focus our attention on the automated differentiators. In this paper tangent and adjoint 
codes have been obtained by Tapenade software [Jj. Current version of this software reveals to be able 
to produce the derivative of such a complex code as ORCA-2. 
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The paper is organized as follows. In the second section, 2° global ocean configuration (ORCA-2J±| 
of the Nucleus for European Modelling of the Ocean (NEMOjl model is described. The third section 
is devoted to the data assimilation technique and to the Automatic Differentiation Engine Tapenade[f| 
software to generate the adjoint and to the optimization of the generated code. Results and discussion 
are presented in the fourth section. 



2 ORCA-2 configuration of the NEMO and its discretization. 

ORCA-2 configuration of the Nucleus for European Modelling of the Ocean [25] is used in this paper. 
This is 2° global ocean configuration based on the OPA 8.2 [53] primitive equations model. 

Its domain extends from 78°S to 90°N. The model grid counts 182 x 149 x 31 nodes. Vertical dis- 
cretization is performed on z levels with the partial step approximation of the bottom cell [T] . Vertical 
mixing is achieved using the turbulent kinetic energy (TKE) scheme described in [2J. 

Spatially discretized equations of the ORCA-2 model are written as follows: 



' S x S y v S y (co + f) - D x x l v - S z ( S x wD z u ) + D X A* V £, + D y A* y uj - 
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Lateral diffusion 
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Vertical diffusion 



w(x, y,z — surface) (5) 
£ = D x u + D y v, uj = D y u~D x v, w= £(x,y,C)d(, p = p(T,s) (6) 

J bottom 

where operators D and S are derivatives and interpolations on the Arakawa C-grid. These operators will 
be discussed below in details. 

The set of variables in this system is composed of u, v and w that represent zonal, meridional and 
vertical velocity components, T and s denote the potential temperature and salinity, £ and uj are horizontal 
divergence and vorticity, r\ is the sea surface elevation and p is the density anomaly that is defined as a 
function of the temperature and salinity by the state equation. As one can see, u, v, T, s, rj are prognostic 
variables while vo, £ , uj and p are diagnostic ones. 



1 http: / / www.mercator-ocean.fr / eng/ science / composantes-systemes / modclisation/orca2 
2 http:/ /www. nemo-ocean. eu/ 

3 http: / / www-tapenade. inria. fr : 8080/ tapenade / index.j sp 
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Among parameters in these equations, we can see / = 2f2 sm(latitude) that represents the Coriolis 
parameter, gravity acceleration g = 9.81 ^J, lateral diffusion coefficients 



^ = 2000^-, A%y = 



40000^-latitude > 15° 
2000^- latitude < 15°, 



To calculate coefficients of the vertical diffusion A z u , A*, A^, A z s we use the turbulent closure scheme 
accompanied by the double diffusive mixing and enhanced vertical diffusion approximations. 

The turbulent closure scheme is applied to solve the problem of statically unstable density profiles. 
The vertical eddy viscosity and diffusivity coefficients are computed from a TKE turbulent closure model 
based on a prognostic equation for e, the turbulent kinetic energy (|7|), and a closure assumption for the 
turbulence length scales [24] . 
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where N is the local Brunt- Vaisala frequency which is calculated as a function of T and s. Parameters l e 
and l K are the dissipation and mixing length scales, P r t is the Prandtl number. The constants C k = v2/2 
and c e =0.1 are designed to deal with vertical mixing at any depth [4]. Following [2], P r t is defined as a 
function of the local Richardson number, Ri\ 



Prt — 




if R, < 0.2 
if 0.2 <Ri<2 
if 2 < Ri 



In frames of the enhanced vertical diffusion parameterization, we assign very big values to the vertical 
eddy mixing coefficients in regions where the stratification is unstable (i.e. when the Brunt- Vaisala 
frequency is negative) |16j . This is done on both momentum u, v and tracers T, s: 



A z = { if TV 2 < then A z = 100} 



(10) 



Double diffusion occurs when relatively warm, salty water overlies cooler, fresher water, or vice versa. 
They contribute to diapycnal mixing in extensive regions of the ocean. The parameterization of such 
phenomena was included in a global ocean model and it was shown that it leads to relatively minor 
changes in circulation but exerts significant regional influences on temperature and salinity in [27 . 



. -7i? p (i+ 1 ( ( Rp/i.6)«) if R p > 1 
a t = A Z T + <j A ddm if < R p < 1 
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I i+(flp/i.6) e if -R p > 1 

K = A z s + l (1.85 R p - 0.85) A ddm if 0.5 < R p < 1 (11) 
Q.l5R p A ddm if 0<i? p <0.5 

A z u = max(^,4,^) 

A^ — max(yl v , Arp, A s ) 

where A ddm — 1.3636 x 10~ 6 e 4 - 6e ( /Rp ' and R p is the buoyancy ratio R p ~ d z T/d z S. 

The term T c dtrj in the equations (JIJ and ([2]) is introduced to dump the external gravity waves. These 
waves are fast so their timescale is short with respect to other processes described by the primitive 
equations. Explicit resolution of these waves requires too short time step which is unnecessary to all 
other physics. Consequently, the filter of temporally unresolved external gravity waves, proposed in |31j . 
is introduced into the model. The cutoff time T c is equal to one time step of the model. 

The model is discretized on the grid, that is the generalization to three dimensions of the well-known 
"C" grid in Arakawa's classification [2S]- The arrangement of variables is the same in all directions. It 
consists of cells centered on scalar points (T, s, 77, p) with vector points (m, u, w) defined in the center 
of each face of the cells. The relative and planetary vorticity, uj and /, are defined in the center of each 
vertical edge. 
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To perform interpolations on this grid and to calculate the derivatives, we introduce operators S and 
D in the discretized equations (HJ)-®- These operators plays the key role in this study because they 
depend on the boundary conditions and introduce them into the model. Despite these operators are 
usually written as overbars and S brackets (see |25j . for example), we note them as letters with index in 
order to emphasize that these are operators under control in this paper. Interpolations are calculated as 
a weighted mean of two function values in the adjacent nodes. Weights are defined to be proportional to 
the grid steps of corresponding cells in order to ensure the second order interpolation of a grid function. 
These weights are described in details in the [25] . but we omit them in this paper to concentrate our 
attention on the principal features. Thus, writing operators in a simplified way, we assume both the 
argument and the result of the interpolation operator and the derivative are multiplied by an appropriate 
weight. 

At each grid- node in the ocean they we write interpolations S x , S y and S z in a common way: 

/ c \ _ u i,j.k + Ui+l , . u i,j,k + Ui,j+l,k 

\o x u)i+l/2,j,k — 2 ' K h y u hd+l/2,k — 2 ' 

(S z u) itj , k+1/2 = ^ fc+1 2 +U ' J ' fc (12) 

Following |12j . |14j . these expressions are modified in the grid-nodes adjacent to the boundary, i.e. near 
the continents for interpolations in x and y directions and near the bottom and the surface for vertical 
interpolation. 

Let us suppose the index i — corresponds to left rigid boundary. That means the index i = 1 is the 
first grid node in the ocean. In this case, we use the formula 

(S>)i/2j,fc = a$ ux ' + af uxl u 0dik + al uxl ux^ k (13) 

to calculate the interpolated value of u at the point 1/2, j, k, where scalar variables T, s, p are defined. The 
value of (S x u)N-i/2,j,k near the right boundary is calculated by the similar formula, but with different 
coefficients a^ uxT , af ux ' , af ux ■ 



"o 



■1 1/2 

-e h 



Ui 



^3/2 

-e 1 — 

U2 



T5/2 

-e h 



"3 



+ J JNT— 3/2 ■ L N-l/2 

H e 1 e 1 e 1 



UN-3 U N -2 



U N 1 



UN 



Figure 1: Structure of the horizontal grid. 
Similarly, interpolated value of T, s, p at points i = l,i = N — 1 are calculated by 
{ox-i )i,j,k — «o + a i ±i/2,j,k + ot 2 H/n,j,k 

\O x l )N-l,j,k = + a l 1 N-l/2,j,k + OL 2 1 N — 3/2, j,k 

Coefficients a play the role of the control variables in this paper. Operators are allowed to change 
their properties near boundaries in order to find the best fit with requirements of the model and data. To 
assign all control variables we shall perform data assimilation procedure and find their optimal values. 

The first coefficient, olq, is added to the interpolation formula to simulate non- uniform boundary 
conditions. Coefficient a\ controls the contribution of the physical boundary value that is prescribed to 
u o.j,k- It should be noted here, that if impermeability condition is imposed, Uq j,k — 0> the interpolation 
(flU)) is controlled by two a only. 

Similar modification of the interpolation formulas are performed near the north and the south bound- 
aries in the y direction. Exception is made for the periodical conditions. They are applied on the 78°E 
longitude and near the North Pole. In this case no modification is made and no control is applied. 

In the vertical direction we control the discretization at two points: one point near the surface and 
another one near the bottom (see figd]). To calculate the interpolated values of the temperature, for 
example, we use the following formula: 

fd rp\ „SzT s , „SzT s rp ,„SzT s rp 

{Jz-l = «0 + a l + a 2 1 i,3,3/2 

{SzT^.k-x = a s a zTb + a?* Tb T idi K _ 1/2 + af T "T iJt K _ 3/2 (14) 

We note here, that the value of the temperature (and other variables defined at the temperature 
levels) is not extrapolated to nodes wo and wk where physical boundary conditions must be prescribed. 
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Figure 2: Structure of the vertical grid. The number of layers depends on the longitude and latitude: 

K = K id 



These conditions participate in the interpolation from w levels to T levels 
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(15) 



Discretized derivatives near the boundary are defined in a similar way as for the horizontal derivatives 
and for the vertical ones. 



(D x u)i/ 2 ,j,k 



,Dux l 



„Dzw° 

a 



Dm 1 



Dm' 



®2 ZW w t,j,K-\ 



(16) 



As well as above, impermeability conditions are prescribed at lateral boundary for u and v. That means 
coefficients a\, being multiplied by the vanishing value of the function on the boundary, are not under 
control in the experiment. On the other hand, boundary condition on the surface of the ocean, prescribed 
for vertical derivatives, are really used in the model and controlled by corresponding coefficient af 2 . 

Taking into account the fact that at different points optimal boundary conditions may be different, 
we accept the coefficients a may vary from point to point. Thus, a Dz and a are considered as functions 
of the longitude and the latitude. Coefficients used in horizontal operators are also allowed to vary from 
one boundary point to another. 

Along with the derivatives and interpolations, boundary conditions influence also the second deriva- 
tives in the vertical diffusion and reconstruction of the vertical velocity w from the horizontal divergence 
(j6|). Discretized equivalent of the ([6]) writes 



Wi,j,K-l 
w i,j,k-l 

W i,j,l 



W W 1 /• 

a Q — a x HZij y K-il;ij,K-i 

Wi,i,k - hzij t k-i^i,j,k-i :2<k <K — 1 



(17) 



a i hzij^ij^ 



One can see the impermeability condition on the bottom may be violated by the term «q in the 
approximation of the vertical velocity near the bottom Wij^K—i that may allow non-zero vertical velocity. 
Coefficient a™ allows to control the thickness of the surface and the bottom layers. 

Control of the boundary conditions for the second derivatives D zz in ([1]) is performed in a slightly 
different way. We add ao to the prescribed boundary conditions on the surface and on the bottom 
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and modify the approximation of the second derivatives at the nodes, adjacent to the surface and to the 
bottom multiplying the finite differencing coefficients by <x\ and 02. Thus, for example, approximation of 

@rM writes 
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The value of hz k corresponds to the difference of depths between adjacent layers where u, v, T, S are 
defined (see figEJ. The grid step hz k +\/2 = ^ Zfc +^ lz k+i ^ g ^ e distance between layers where the vertical 
velocity Wk and w k +i are defined. The vertical grid is not uniform, so the gridsteps are not equal one to 
another. 

As well can see, impermeability is no longer imposed on the bottom and the temperature and salinity 
of the bottom are modified. The wind tension on the top is controlled for velocities and an additional 
control flux of T and S is added on the surface. Parameters a\ and oli helps also to control the position 
of the boundary modifying the depth of the first and the last layers. 

Thus, controlling boundary conditions in this paper, we control, in fact, the discretization of operators 
near the boundary determined by the set of coefficients a. In this paper, we look for optimal values of 
a for the following 24 operators that approximate either derivatives or interpolations in the horizontal 
plane: 
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D x 


D x u divg. 


D x v vort. 


D x (u 2 + v 2 ) adv. 


D x uj dissip. 




D x (uS x T,uS x s) 




D x p press. grad. 




Dy 


D v u vort. 


D x v divg. 


D v {u 2 + v 2 ) adv. 


DyUj dissip. 






Dy(vSyT,vSyS) 


D y p press. grad. 




s x 


S x u 2 Kin. en. 


S x v Coriolis 


S x s adv. 


S x (S y u) Coriolis 








S X T adv. 


S x uj adv. 


Sy 


S y u Coriolis 


S v v 2 Kin. en. 


SyS adv. 


S y {S x v) Coriolis 


S y T adv. 


SyUj adv. 



Table 1 : Controlled operators in x — y direction 



Total set of controlled a counts 2 000 808 coefficients. 

In the vertical direction, we control the discretization near the boundary of 4 derivatives and 5 
interpolations (see table [5]) accompanied by two approximations of the second derivative D zz u, v and 
D ZZ T, s according to (fT9|) and by the reconstruction of the vertical velocity w (fl7|) . The set of controlled 
coefficients counts 1 197 792 elements in this case. 



Argument 


Operators 


T,s 


D z u adv. 


D z v adv. 


S Z T adv. 


S z s S z p 


w 


D z wT adv. 


D z ws adv. 


S z (wD z u) adv. 


S z (wD z v) 



Table 2: Controlled operators in z direction. 



We can see the number of control variables is comparable in both cases. These numbers are also 
comparable with the dimension of the system state (1 707 245 variables) that we need to control identifying 
the initial conditions of the model. 



3 Data assimilation. 

One of the principal purposes of variational data assimilation consists in the variation of control parame- 
ters in order to bring the model's solution closer to the observational data. This implies the necessity to 
measure the distance between the trajectory of the model and the data. Introducing the cost function, we 
define this measure. Generally speaking, the cost function is represented by some norm of the difference 
between the model's solution and observations accompanied by the difference with the background that 
is used as regularization term. 
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3.1 Cost function 



To define the cost function we introduce dimensionless state vector cj> that is composed of five variables 
of the model = {w u m, w v v, wtT, w s s, w^ry}' weighted by coefficients w. The choice of these weights 
that are used to normalize the model variable will be discussed later. The distance between the model 
solution and observations is defined as the Euclidean norm of the difference 

e = e a c#p,t)) = - = (20) 

m 

where % is the operator that interpolates the model solution to the observation point and the sum is 
performed over all available observations at time t. 

In this expression, we emphasize the implicit dependence of £ on time and on the set of the control 
parameters p that is composed of 

• the set of initial conditions of the model <pa — {u \t=o, v \t=o, T \t=o, s |t=o, t) |t=o} ; 

• the set of the coefficients a xy that controls the discretizations of horizontal operators (Table [1]) in 
the vicinity of the continents, 

• the set of the coefficients a z that controls the discretizations of vertical operators (Table [2]) near 
the bottom and near the surface, 

• Vertical diffusion coefficients A^, A*, A^, A* 

The cost function is composed of two terms. One of them measures the distance from the observa- 
tions, and another one is added to avoid irregular solution that may occur due to lack of observational 
information and the ill-posedness of the problem. Indeed, the dimension of the model state is bigger 
than the quantity of available observations. That means we can not identify the model state in regions 
where observational information is absent. To avoid this problem, we add the background term in the 
cost function. This term allows us to determine the solution everywhere requiring that it must be close 
to the specified background. The regularization term we use has a form 

l b9r ( P )=J2(Pm-PnT) 2 (21) 
m 

where p m is the set of parameters under control and p h ^{ r are background values of these parameters. 
In this paper we use initial guesses as background for all parameters. That means, when we look for 
the optimal initial state of the model, we use original unperturbed initial conditions as initial guess for 
the minimization and as the background as well. Looking for optimal a, we use combinations ao = 0, 
a.\ = = 1 for derivatives and olq = 0, ot\ = a.i = 1/2 for interpolations both for initial guess and for 
the background. 

The difference model-observations (|2T)]) contribute to another component of the cost function. Taking 
into account the results obtained in 13] , we define the cost function as 

T 

X obs (p)= Jte(4>{p,t))dt (22) 
o 

that gives a bigger importance to the difference £ 2 at the end of assimilation interval. So far, we perform 
the data assimilation in order to make a forecast, we need a "better" estimate of the model state at the 
end of the assimilation window because this state is used as the initial point for the forecast that starts 
just after the assimilation. For this purpose, we require the model to go closer to observations at the end 
of the assimilation window increasing the weight of the distance in the cost function. 

To search for a minimum of the cost function, we shall use its gradient with respect to control 
parameters. The gradient of the background term (f2"Tj) can be calculated easily 

Pffbgr 

(VX)%r = 2{p n -p b r) (23) 

OPn 



The gradient of the second component of the cost function (|22l) can be calculated as a Gateaux derivative 
of an implicit function: 



d(j) m dp n 



dt 
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]T((^) m -c hs )§^V (24) 

because the derivative jrf— can easily be calculated from ([20|) : jj-t — = 2{{H(f>) m — 0m S )- The second 



term in (|24|) . ^ m , represents the matrix of the tangent linear model that relates the perturbation of 

mth component of the model state vector <j) m to the perturbation of the parameter p n . This relationship, 
of course, is assumed in the linear approach, that means it is only valid for infinitesimal perturbations. 
In the classical case, when initial conditions are considered as the only control variable, the derivative 

^ < Qp^ — i s the classical tangent model that describes the temporal evolution of a small error in the 

initial model state. The matrix is a square matrix that is widely studied in numerous sensitivity analyses. 
Its singular values at infinite time limit are related to well known Lyapunov exponents that determine 
the model behavior (chaotic or regular) and the dimension of its attractor. 

In our case, the matrix is rectangular in general. It describes the evolution of an infinitesimal 

error in any parameter (including initial state). However, we can study its properties in the similar way 
as we do with the classical tangent linear model. Its structure and composition is described in P3] for the 
case of using coefficients a as control parameters and in for the case when the bottom topography is 
used to control the model solution. 

The product ^2 m {{H4>)m ~ 4>m, s ) Qp 1 m represents an unusual vector-matrix product. To calcu- 
late this product directly we would have to evaluate all the elements of the matrix. This would require 
as many tangent model runs as the size of the state vector is. So, instead of the tangent model, we shall 
use the adjoint one that allows us to get the result by one run of the model. Backward in time adjoint 



model integration that starts from {4> — (f> ) provides immediately the product yjj^ ) {4> ~ <f> ) which 

is exactly equal to ((/> - <j) obs )^ in HMD- 
Using these notations, we write 



T 

VJ ,„v. 2 f t(?^p) (0(p,t) - r bs (t))dt (25) 



where the expression in the integral is the result of the adjoint model run from t to starting from the 
vector (<j>(p,t) - (f> obs (t)). 

3.2 Adjoint model 

Tangent and adjoint models have been automatically generated by the Tapenade software [7] developed 
by the TROPICS team in INPJA. This software analyses the source code of a nonlinear model and 

produces codes of its derivative and of the adjoint [ ^ 
r op ■'yap 

The obvious advantage of the automatic tangent and adjoint code generation consists in avoiding of 
huge development work. This fact is much appreciated especially in the case of control of internal model 
parameters other than initial conditions. As it has been shown in |14j . the derivative of the model with 
respect to boundary conditions is composed by two blocks: one of them is composed of operators acting 
in the space of the model variables and another one is composed of operators linking the model variables 
and the controlled parameter. The first block is responsible for the evolution of a small perturbation 
by the models dynamics and is similar for any data assimilation. The second one determines the way 
how this perturbation is introduced into the model and is specific to the particular parameter under 
control. This block is absent when the goal is to identify the initial conditions of the model because the 
uncertainty in initial conditions is introduced only once, at the beginning of the model integration. But, 
when the uncertainty is presented in an internal model parameter, like in this case, the perturbation is 
introduced at each time step of the model. 

Consequently, controlling internal model parameters we must write an additional block for the tangent 
model which is at least as complex as the whole tangent model for the initial point (see equations 5 and 
6 in [14] where these blocks are compared for a shallow- water model). Moreover, controlling internal 
parameters we may be brought to the necessity to enlarge the control set, to study the influence of some 
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other parameter. Simple automatic generation of tangent and adjoint codes helps us to be free in the 
choice of parameters to control. 

On the other hand, automatically generated code frequently suffers from excessive requirements to 
computer memory. To be able to integrate the adjoint model backward in time, we have to keep the for- 
ward trajectory. There have been proposed in 35 to use Griewank and Walthers binomial checkpointing 
algorithm [5] , that assumes to keep forward model solution at several time-steps only and to recalculate 
all other time-steps during the adjoint model run. The spacing of time-steps where we keep the forward 
model solution should be chosen in order to minimize the calculation time. It has been shown in [35] that 
the slowdown of the adjoint model can be limited by the factor 7 for the assimilation window of 1000 
time-steps while keeping only 27 instantaneous model states. 

In this paper, we start from the analysis of the adjoint code obtained from the Tapenade software 
and removing from keeping an unnecessary data. So far, the time-stepping is performed by the leap-frog 
scheme, the adjoint code saves two time levels at each step: the nth and the n — 1th. However, the n — 1th 
layer is unnecessary to save because it was saved on the previous step. We avoid also saving the values 
of all variable on the continents because we do not need them at all. This divides the required memory 
approximately by two. We do not keep in memory all forcings: they have been read from files during the 
forward model run, they will also be read when necessary during the adjoint run. Despite the model is 
written with double precision, it reveals to be sufficient to keep the trajectory with the single precision 
only. Moreover, we can keep odd time-steps only, retrieving all even time-steps by interpolation. This 
interpolation perturbs only a little the gradient dividing the required memory by two. 

Summarizing all the efforts, we can divide the memory, required for keeping the forward trajectory, by 
25 ensuring that 10 days assimilation window of the ORCA2 model fits into 640 MB space. Taking into 
account possible use of the binomial checkpointing algorithm, we can state that automatic generation of 
the adjoint code by Tapenade can be used even for finer resolution models and for longer assimilation 
windows. In these cases we will be limited by a long computational time rather than by an excessive 
required memory. Massively parallel version of the adjoint code will, consequently, be necessary, but it 
is beyond the scope of this paper. 

3.3 Discontinuous diffusion coefficients 

Another problem we face trying to apply variational data assimilation to the ORCA-2 configuration 
consists in the discontinuous coefficients of the vertical diffusion. As we have seen in expressions (|9]), (fT0| 
and (fTTj) . coefficients A z depend on the buoyancy ratio R p and on the Brunt- Vaisala frequency N 2 in a 
discontinuous way. Infinitesimal variation of the Brunt- Vaisala frequency from — e to +e in some grid 
point results in a drastic change of all diffusion coefficients A z from values of order 10~ 5 ...10~ 4 up to 
100. Similar effect results from a small variations of the buoyancy ratio in the vicinity of 0.5 and 1. This 
implies the whole system is discontinuous and, hence, not diffcrcntiable. We can neither calculate the 
tangent model, nor its adjoint and the variational data assimilation can not be performed. This problem 
has already been reported in relation with the atmospheric model in |10) . 

So, we have to modify the diffusion coefficients in order to make the model difierentiable. Unfortu- 
nately, a simple smoothing of the diffusion coefficients can not solve the problem. So far, the jumps of the 
coefficients values may exceed 10 6 , any reasonable smoothing modifies a lot low values of the coefficient in 
the adjacent points. Instead of values of order 10~ 5 ...10~ 4 we get 10 _2 ..10 _1 that dumps all the vertical 
physical processes. 

To be able to perform variational data assimilation, we have to suppress completely the enhanced 
viscosity (fTU)) and the double diffusive mixing (fTT|) . The turbulent closure scheme can, however, be 
smoothed replacing A% = max(A§, CkhV&) hi ® by 



Of course, this modifies the model solution. In particular, the suppression of the enhanced viscosity that 
dumps unstable situations with imaginary Brunt- Vaisala frequency may lead to uncontrolled instability 
in the future. The question of the smooth diffusion coefficients that we can use in the variational data 
assimilation must be considered in details, but this long study is beyond the scope of this paper, where 
we focus our attention on the influence of the control parameters on the solution. 

In all the matter below, we shall use the model with smooth vertical diffusion coefficients as the basic 
model to be compared with models subjected to modified initial conditions or modified parametrizations of 
the boundary conditions. Controlling coefficients A z , we shall also start from the smooth ones comparing 
them to the obtained in the data assimilation process. 




(26) 
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3.4 Assimilated data and background fields. 

The observational data set for this model consists of the set of the ECMWF data representing sea surface 
anomalies issued from Jason- 1 and Envisat altimetric missions and the set of vertical profiles of the 
Temperature and salinity from ENACT/ENSEMBLES data banque [5]. 

The sea level anomaly data set counts 112 thousands measured values distributed over 20 days interval 
(between the 1st and the 20th of January 2006). The distribution of the data is not uniform in time: we 
have about 6700 points per day during first 10 days and 4500 points per day after the 10th of January. 

The temperature and salinity profiles counts about 200 000 measures during the same 20 days: more 
than 12000 observations per day before the the 10th of January, and about 6000 per day after this date. 
The quantity of data is not uniformly distributed within each day also. Some time steps of the model 
(1/15 day) receive more than 5000 data to assimilate, some other steps got 1500 data points only. 

The background fields we use in the assimilation (1211) depend on the parameter under control. In this 
paper, the initial guess for the minimization and the background were always the same. Thus, 

• the set of initial conditions of the model representing the model state on the January, 1, 2006, 
OOh GMT is used as initial guess and background when we control the initial point of the model 
00 = {u |i=o, v | t= o, T | t=0 , S | i=0 , T) |i=o}; 

• the set a xy — 0, af xy — a^ xy — 1/2, a^ xy — a® xy = 1 is used as initial guess and background when 
we controls the discretizations of horizontal operators (Table [T]) in the vicinity of the continents; 

• the set of the coefficients = 0, af z = af z = 1/2, af z = ol® z = 1 that controls the discretizations 
of the vertical operators (Table [2]) near the bottom and near the surface. This set is accompanied by 
Oq — 0, af — 1 used to calculate the vertical velocity w in (|17p and by a^ zz = 0, a® zz = a® zz = 1 
used to approximate the second vertical derivative in (jT9j) ; 

• the instantaneous values of the vertical diffusion coefficients A z u , A z , A^, A z s on the 10th of Jan- 
uary OH GMT are used as initial guess and background when we look for optimal values of these 
coefficients. 

4 Results 

In this section we perform four experiments of data assimilation for identification of optimal values of 
four sets of parameters described above. All experiments are performed following the same algorithm 
and the same software. In each experiment one parameter is controlled, all other parameters are kept 
equal to their background states. 

The data assimilation is performed by minimization of the total cost function 

l(p) = \Q- A l b 9 r ( p ) + X ohs (p) (27) 

composed of (|21l) and (|2"2"j) using the minimization procedure developed by Jean Charles Gilbert and 
Claude Lemarechal, INRIA [5]. This procedure uses the gradient of the cost function (|23|) . ((24]) in the 
the limited memory quasi-Newton method. 

We start from assimilation window T = 5 days and allow the minimizer to make 20 calls of the 
routine that calculates the cost function and its gradient. Optimal values of the controlled parameters 
obtained as the final state of the minimization is used as initial guess in the second minimization, that 
is performed with the window T = 10 days and the minimizer is allowed to make 40 calls. The set of 
parameters obtained as this final result we shall call the optimal set. 

Of course, this optimal set is optimal within the assimilation window only. However, we shall analyze 
the model behavior beyond the window, and namely 10 and 20 days later, on the 20th and 30th of 
January. 

4.1 Comparison of different control parameters 

Convergence of the cost function during the minimization procedure is shown in the fig(3K. Despite 20 
cost functions call were allowed in all experiments, the minimizer has made less iterations, i.e successful 
updates of the controlled parameter. 

The worst convergence is obtained when the parametrization of the lateral boundary conditions is 
controlled (solid line in fig[3]A Minimizer performs several iterations with both T = 5 and T = 10 days 
windows, but no visible reduction of the cost function is obtained. The same phenomenon we see in 



10 



figG33, where the evolution of the distance £(t) (|20"|) is shown during the assimilation and 10 days after 
its end. The dotted line (the model with the optimal parametrization of the lateral boundary conditions) 
is almost indistinguishable from the solid line that corresponds to the original model. 

This fact shows that lateral boundary conditions play no role in the ORCA-2 model. It is useless 
to control them hoping to improve the solution or to bring it closer to observations. This fact is in 
contradiction with the result obtained in [T31 H3] where the influence of the lateral boundary conditions 
on the solution of the shallow-water model is very important. But this contradiction can be evidently 
explained by different resolutions and different lateral dissipations. Indeed, the shallow water model 
discussed in [T4J [15] is an eddy resolving model with a developed turbulence while the grid step of 
ORCA-2 is approximately equal to 2 degrees (i.e. about 200 km) that can not be considered as eddy- 
resolving at all. Moreover, the lateral dissipation coefficient used for the shallow water model was either 

2 2 

50 or 200^-, while in the ORCA-2 this parameter is equal to A* v = 40000^- in extra-tropical region. 
Such a strong dissipation admits neither turbulence nor lateral boundary layers, i.e. configurations in 
which the lateral boundary conditions play an important role and worth to be controlled. 

So far, the lateral boundary conditions arc not important in this case, we shall not consider them as 
a controlled parameter below. 

When we control either initial state (f>o or vertical diffusion coefficients A z of the model we get similar 
results. Control of each parameter allows us to reduce the cost function value by more than 20% (two 
dashed lines with short dashes in fig|3]A.). In the fig|3j3 these two parameters show close behavior too (also 
two dashed lines with short dashes). The distance from observations is smaller than for the model with 
original parameters. That means the control of the initial state of the model and its vertical diffusion 
coefficients is important if we want to bring the trajectory closer to observations. We should note, that 
the trajectory remains closer to observations even after the end of assimilation. That means we can hope 
to get a better forecast with the optimal model. 
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Figure 3: Convergence of the cost function (|27j) during the minimization procedure (A) and the evolution 
of the distance £(t) (120j) during and after the assimilation (B). 

But the most influent parameter in this set of experiments is the parametrization of boundary condi- 
tions for vertical operators. If we control the conditions on the surface and on the bottom of the ocean, 
we get a rapid convergence of the cost function (dashed line with long dashes in figl3]4) and the shortest 
distance to observations (the same line in figOP). This indicates that vertical boundary conditions are 
very important in this case from the point of view of as the data reanalysis in the assimilation window 
and the forecast beyond the window. 

4.2 Modification of physical variables under control of vertical boundary con- 
ditions. 

In order to examine the modification of the physical variables produced by the use of optimal model 
parameters, we plot first the sea surface elevation in two regions: North Atlantic and North Pacific. 
These regions are characterized by the presence of strong jet-streams (Gulf Stream in the North Atlantic 
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and Kuroshio in the North Pacific) difficult to reproduce by low-resolution models like ORCA-2. The 
strength and the length of the jets are usually reduced by these models while the width is over-estimated. 

The sea surface elevation is the variable that reflects the integral impact of the jet on the model 
solution. The strength of the jet can be evaluated by the orthogonal derivative of the SSH while the 
length is represented by the length of the region with the big gradient. 

A. B. 




BO 70 60 50 40 30 20 10 

Contours from -1 to 0.4 interval 0.1 



Figure 4: Sea surface elevation in the North Atlantic on the January, 30. Optimal initial conditions (A), 
Optimal Vertical boundary conditions (B) and Optimal vertical diffusion coefficients (C). 

Three simulation have been performed for 30 days beginning from the 1st of January 2006. The sea 
surface elevation in the North Atlantic obtained as the final states are presented in the figHJ The first 
one (A) corresponds to the model that starts from optimal initial state. The second one is obtained with 
the optimal parametrization on boundary conditions of all vertical operators. Optimal coefficients for 
vertical diffusion were used to obtain the third result. 

One can see that (A) and (C) parts in the fig|3]look very similar. Optimal diffusion coefficients move 
the Gulf Stream slightly to the South increasing by 10 cm the positive anomaly on the South and reducing 
the negative anomaly on the North by the same 10 cm. The length and the strength of the Gulf Stream 
is the same in both pictures. 

The modification of the circulation produced by the optimal parametrization of the vertical boundary 
conditions is much more visible in the fig 0)3. The positive anomaly is reinforced and moved to the North- 
East. The beginning of Gulf Stream is moved slightly to the North, becomes narrower and its force 
increased as well as its length. This case clearly represents much better solution from the physical point 
of view. 

Similar effects can be observed in the North Pacific in figEl slightly reinforced positive anomaly on 
the South of Kuroshio and reduced negative one on the North in (A) and (C) parts showing similar 
influence of the controlled parameters on the solution. And also, the figll}3 obtained with the optimal 
boundary conditions for vertical operators is strongly modified. Like Gulf Stream, Kuroshio become 
longer, stronger and narrower. Moreover, it becomes also more tortuous that seems like a simulation of 



12 



A. 



B. 
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Figure 5: Sea surface elevation in the North Pacific on the January, 30. Optimal initial conditions (A), 
Optimal Vertical boundary conditions (B) and Optimal vertical diffusion coefficients (C). 



moving eddies. This is also a good result obtained in frames of a coarse resolution model. 

The immediate idea one can have looking at these pictures that it is an artifact. Indeed, the sea 
surface elevation r\ is directly related to the vertical velocity on the surface (see ©). We control w on the 
surface adding control coefficients a$ and a™ (see (fl7|) ). So far, much observational data is collected by 
satellites and available for the sea surface elevation, the variable rj might undergo a strong forcing pulling 
it toward observations. It is possible, consequently, that data assimilation modifies just coefficients a w " 
on the surface for w velocity and this modification is immediately translated to 77 with no influence on 
all other variables. 

However, analyzing the patter of a w we can not see any significant modification that can influence 
the jet streams. First, the magnitude of is of order 1CP 7 while the velocity w on the surface is about 
10~ 5 ...10~ 4 . Second, no characteristic patterns of jet streams can be observed in the fields a™ and a™ . 
And third, the perturbation of the vertical velocity by af on the bottom is bigger than on the surface. 

Values of the velocity w in the vertical y — z section by the 60°W meridian are shown in fig|6] This 
section is indicated by the thick vertical line in figHf3. One can see strong values of w, reaching 1.5^^, 
near the bottom of the ocean. These values can't be observed in the original model because of imper- 
meability boundary condition imposed for w and no-slip condition for horizontal velocity components: 

u I bottom = v I bottom = °- 

Consequently, we shall state that it is the modification of the bottom boundary conditions that 

influences the jet streams near the surface. 

An interesting question we may ask, whether it is the vertical velocity itself that is modified near 
the bottom, or the modification is due to divergent modification of lateral velocities u and v. In other 
words, whether coefficients a create fountains at the bottom that spray out jets of water, or they create 
three-dimensional eddies in which the divergence of lateral velocities £ ([5]) is balanced by . 

To answer this question, we plot the u velocity in the same y — z section in figlZj3 (velocity u is 
orthogonal to the picture plane) and we see that the velocity u also has anomalies near the bottom in 
the same places as the velocity w, and namely in regions 18°...21°N and 24°...27°N. That indicates the 
source of the w velocity may lie in the divergent part of the lateral velocity components. Moreover, in 

4 This picture is not presented. For m ore color pictures and movies, please see 

http: / / www- ljk.imag.fr / membres /Kazantsev/orca2/index.html 
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Figure 7: Velocity component u in y — z plane for x = 60 W: obtained with optimal initial conditions 
(A) and optimal parametrization of vertical boundary conditions (B). 



figEP, one can see an important anomaly of u near the bottom (39°...41°N) that is almost non divergent 

(the derivative $jj is compensated by t^) producing only a little anomaly in w. 

Moreover, in fig[7)3 one can easily see the influence of anomalies of u and v near the bottom on the jet 
stream near the surface. The positive anomaly of u just below the jet stream (39°...41°N) maintains the 
flux over the whole water column ensuring more than 10^p velocity at the 500 m depth. Comparing the 
velocity pattern with the velocity obtained in the experiment with optimal initial conditions presented in 
figlZRi we see that u can hardly reach the value of 5^|p at the 500 m depth. That means the Gulf Stream 
become not only stronger (one can see the surface velocity exceeds 35 instead of 25 ^jp in figiZK) and 
longer, but also deeper. 

This fact brings us to the conclusion that the major impact of the data assimilation is made on the 
parametrization bottom boundary conditions for lateral velocities. Consequently, the modification of the 
sea surface elevation in the jet-stream regions is not a simple artefact, but the result of optimization of 
the deep dynamics. 

Finally, we should note that the temperature and salinity fields has been modified only a little in the 
data assimilation. We can see almost no difference with the original fields near the bottom, and a little 
difference near the surface, especially in the equatorial region. 

As we have seen in figEl optimal coefficients A z allows us to bring the model solution closer to 
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observations. However, they don't amplify the stream jets figHJ figE] like optimal boundary conditions. 
Analyzing the modifications made by the data assimilation, we can see numerous small regions (and even 
points) where the viscosity become vanishing. This result seems not to be physically interesting. 

Of course, more pictures should be presented in this section in order to show in details the modification 
of different variables of the model. However, preserving the volume of the paper and taking into account 
the impossibility to present multimedia content in the Journal, the reader can find numerous color pictures 
and movies at |http: / /www- ljk.imag.fr/membres/Kazantsev/orca2 /inde xThtml| 

5 Conclusion and perspectives 

This paper shows possible advantages of extending of the now classical variational data assimilation pro- 
cedure to the control of the model parameters other than initial conditions. Controlling paramctrization 
of boundary conditions of vertical operators in the ORCA-2 model we have brought the solution closer 
to observations not only during the assimilation stage, but also during the forecast. We have shown also 
that the jet streams in the North Atlantic and the North Pacific are modified under control of the vertical 
boundary conditions. This result seems to be interesting because these jets are extremely difficult to sim- 
ulate in frames of a low-resolution model. Another important particularity consists in the fact that it is 
the bottom boundary conditions for lateral velocities u and v were subjected to the major modifications. 
That shows the importance of the appropriate approximation of the bottom topography and processed 
in the bottom boundary layer. 

Performing the control of parameters, it is not clear in advance what parameter is the most important 
for a given model. It was lateral boundary conditions in the case of the shallow- water model in [TH [T3] , 
but they play no role in the ORCA-2 model. Taking into account the effort made to construct the vertical 
diffusion coefficients A z for this model (including an additional evolution equation for TKE e(x, y, z, £)), 
one can suppose they might be extremely important in this configuration. However, their influence is not 
as strong as the influence of vertical boundary conditions. 

This fact helps us to understand that boundary conditions are only important to optimize in the case 
when the corresponding operator is important for the model dynamics. The solution of the shallow-water 
model discussed in [141 115) was exhibiting either developed turbulence or strong under- resolved boundary 
currents showing that horizontal operators and their boundary conditions should be considered carefully. 
In the case of ORCA-2, almost all the horizontal dynamics is dumped by the strong dissipation that leads 
to a little influence of corresponding boundary conditions. 

But, in the case of ORCA-2, the vertical dynamics is important. The dissipation and diffusion 

2 

coefficients are chosen as low as possible (sometimes as low as A z = 10~ 4 ^r- compared to A xy = 

2 

4 x 10 +4 -^- ) in order to preserve the vertical currents. This fine adjustment points out the importance 
of the vertical dynamics and requires, consequently, careful treatment of its boundary condition. The 
present study confirms this fact, showing the optimizations of the vertical boundary conditions modifies 
much the currents. 

Summarizing, we can say that the control of model parameters can help us 

• to compensate the model errors and to bring the model solution closer to observational data im- 
proving the forecast; 

• to see what parameters are more or less important for a given model and for a given configuration; 

• to see what geographical region requires a particular attention in formulation of the boundary 
conditions. 

The use of the automatic differentiation tool reveals to be extremely useful in this study helping us 
to avoid the huge coding and debugging work. This fact is very appreciated in the situation when we 
come out the frames of controlling the only classical model parameter — its initial state. Now, we can 
be brought to consider various parameters of the model as worthy to be optimized, facing the necessity 
to get the derivative of the model and its adjoint with respect to the chosen parameter. 

The major shortcoming of automatic differentiation tool that consists in an excessive requirement 
of computer memory (hundreds or thousands times of the model code) can be avoided by the memory 
usage optimization and by the binomial checkpointing algorithm [6] . These techniques bring the memory 
demand into a reasonable limit and the principal barrier to overcome becomes the usual computing time. 
Consequently, automatic differentiation of massively parallel codes and parallelization of the adjoints is 
necessary. 
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However, the present study can not be considered as a finalized control of the model parameters. 
Numerous points, that are extremely important from physical and mathematical points of view, have not 
been discussed in this paper. 

Standing on the mathematical point of view, we must address two principal questions about uniqueness 
and stability of the parameter identification. These questions arise because we are solving an inverse 
problem, that is non-linear and probably ill-posed. The uniqueness of the solution determines whether 
it is at all possible to obtain the value of unknown parameter from observational data. The notion of 
stability determines whether small errors in data would cause serious errors in the identified parameter. 

Both these questions have negative answer in this paper. The set of control coefficients was intention- 
ally chosen to be too large to ensure an unique solution. As it has been shown in [12], exceedingly large 
set of a leads to a non-null kernel of the Hessian and results in a non-unique choice of optimal boundary 
conditions. 

Moreover, the nonlinearity of the model solution with respect to boundary conditions is far from 
being quadratic. Consequently, the cost function I{p) (f27|) . along with the global minimum, may possess 
numerous local ones. Applying the minimization procedure, we shall find one of them instead of the global 
one that will result in a supplementary non-uniqueness of the solution. Incremental data assimilation 
technique may help us to improve the minimization by finding a deeper minimum, but it can not guarantee 
the minimum found is global. 

And finally, we can not find a minimum at all because we can perform only a limited number (few) 
iterations of the minimization process in a reasonable computing time. Even with ORCA-2, one of the 
simplest models of the Global Ocean, we could perform some 40 iterations. Hence, the obtained result is 
far from even a local minimum. 

Consequently, this study can not pretend to be an identification of the boundary conditions, but rather 
a way to compensate some model errors due to the lack of resolution and approximative parametrizations. 

The same conclusion we can make from the physical point of view also. First of all, coefficients a are 
not physical parameters. They have been chosen as controls because any imaginable boundary condition 
can be approximated by a combination of these three a. Even unphysical conditions, like a permission 
to cross the continent contours, are intentionally accepted because they may point out that the model 
geometry is not in the agreement with the model dynamics. Moreover, we accept different a for different 
operators even dealing with the same variable assuming the possibility of different boundary conditions 
imposed to the same variable and increasing the probability to get a non-null kernel of the Hessian when 
one a compensates one another. 

Thus, an exceedingly wide set of a is controlled instead of some physical set of boundary conditions in 
order to allow the data assimilation to modify any a bringing the model solution closer to observations. 

Consequently, the values of a them-self are neither important, nor physically meaningful in this study. 
That's why they are not plotted in this paper. But they show the result we can potentially get optimizing 
boundary conditions, the regions where this optimization is particularly important and which operators 
and which variables of the model should attract a special attention. Analyzing the modification of the 
physical variables, we can further reduce the quantity of the controlled a and proceed to the control of 
the physical boundary conditions including the position of the boundary that may should be moved from 
the current position. 
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